Abstract:


In this paper, we work out a detailed mathematical analysis of a new learning algorithm termed 
Cascade Error Projection (CEP) and of a general learning framework. This framework can be used to 
obtain the cascade correlation learning algorithm by choosing a particular set of parameters. 
Furthermore, the CEP learning algorithm operates on only one layer, whereas the other set of weights can 
be calculated deterministically. In association with the dynamical stepsize change concept, which converts the 
weight update from an infinite space into a finite space, the relation between the current stepsize and the 
previous energy level is also given, and the estimation procedure for the optimal stepsize is used to validate 
our proposed technique. 

Learning starts from zero weight values in every layer, and a single hidden 
unit is applied instead of a pool of candidate hidden units as in the cascade correlation scheme. 
Therefore, simplicity in hardware implementation is also obtained. Furthermore, this analysis allows us 
to select among other methods (such as conjugate gradient descent or Newton's second-order method), any 
of which may be a good candidate for the learning technique. The choice of learning technique depends 
on the constraints of the problem (e.g., speed, performance, and hardware implementation); one 
technique may be more suitable than others. Moreover, for a discrete weight space, the theoretical 
analysis presents the capability of learning with limited weight quantization. Finally, 5- to 8-bit parity 
and chaotic time series prediction problems are investigated; the simulation results demonstrate that 
weight quantization of 4 bits or more is sufficient for neural network learning using CEP. In addition, it is 
demonstrated that this technique is able to compensate for lower weight resolution by incorporating 
additional hidden units. However, generalization results may suffer somewhat with lower-bit weight 
quantization. 

I. Introduction 


There are many ill-defined problems in pattern recognition, classification, 
vision, and speech recognition that need to be solved in real time [1-3]. One of the 
most attractive features of neural networks is their massively parallel processing 
topology, which offers tremendous speed, especially when implemented in hardware. 
Generally, neural network approaches in hardware face two main obstacles: 

(1) difficulty of network convergence due to the learning algorithm itself as well as 
the limited precision of the devices; 

(2) high cost of implementing hardware to truly mimic the synapse and neuron 
transfer functions dictated by the algorithm. 

Furthermore, convergence and hardware implementability depend on each 
other; for example, the convergence of the learning network depends 
on the weight resolution available in the synapse [4-6], and the cost of implementing 
each additional bit in the synapse at least doubles in silicon area, power, and connectivity [7-8]. 
In this paper, the CEP learning algorithm is presented. It offers a simple learning 
method using a one-layer perceptron approach and a deterministic calculation for the 
other layer. Such a simple procedure offers a fast, reliable, and implementable learning 
algorithm. In addition, the learning technique is not only tolerant of 3- and 4-bit weight 

Figure 2: CEP learning capability and the number of hidden units 
required to correctly solve 5- to 8-bit parity problems using the 
round-off technique. The x axis represents weight quantization 
(3- to 6-bit and 64-bit) and the y axis shows the resulting 
number of hidden units (limited to 20). Each learning hidden 
unit is given 100 epoch iterations. As shown, a larger number 
of hidden units compensates for lower weight resolution. 
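
The round-off technique named in the caption can be illustrated with a minimal sketch. The source does not specify the quantization grid, so the symmetric uniform grid on [-w_max, w_max], the function name `quantize_round_off`, and the parameter `w_max` are assumptions introduced here for illustration only.

```python
import numpy as np


def quantize_round_off(weights, n_bits, w_max=1.0):
    """Round each weight to the nearest level of a uniform signed grid.

    Assumption (not from the paper): the grid is symmetric on
    [-w_max, w_max] with 2**(n_bits - 1) - 1 levels on each side of zero.
    """
    levels = 2 ** (n_bits - 1) - 1          # positive levels available
    step = w_max / levels                   # spacing between adjacent levels
    clipped = np.clip(weights, -w_max, w_max)
    return np.round(clipped / step) * step  # round-off to nearest level


if __name__ == "__main__":
    w = np.array([0.03, -0.48, 0.77, 1.30])
    for bits in (3, 4, 6):
        print(bits, quantize_round_off(w, bits))
```

With fewer bits the grid is coarser, so small weight updates can be rounded away entirely. This may be one reason additional hidden units are needed at low resolution.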


Chaotic Time Series Problem: 

The data in this problem are chaotic and never repeat. However, the past, 
present, and future values are correlated in high order. To validate the capability 
of CEP predicted by theory, we apply the CEP learning technique under the constraint of limited 
weight quantization (4-, 6-, and 64-bit weight resolution) to capture the high-order 
correlation of this problem. 

In this experiment, we use $x_i$, $x_{i+1}$, $x_{i+2}$, $x_{i+3}$ and the target is $x_{i+4}$. The number of 
training samples is 351 and the number of test samples is 651; no cross-validation data 
are used in this phase. 
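
A minimal sketch of how such input/target patterns could be built from a scalar series follows. The source does not name the chaotic series, so the logistic map is used purely as a stand-in. The window of four consecutive samples with the following sample as target, and the helper names `logistic_map` and `make_windows`, are likewise assumptions. Only the pattern counts 351 and 651 come from the text.

```python
import numpy as np


def logistic_map(n, r=4.0, x0=0.3):
    """Stand-in chaotic series (assumption: the paper's series is not specified)."""
    x = np.empty(n)
    x[0] = x0
    for i in range(1, n):
        x[i] = r * x[i - 1] * (1.0 - x[i - 1])
    return x


def make_windows(series, n_inputs=4):
    """Each row holds n_inputs consecutive samples; the target is the next sample."""
    X = np.lib.stride_tricks.sliding_window_view(series[:-1], n_inputs)
    y = series[n_inputs:]
    return X, y


if __name__ == "__main__":
    n_train, n_test, n_inputs = 351, 651, 4
    series = logistic_map(n_train + n_test + n_inputs)
    X, y = make_windows(series, n_inputs)
    # Split by pattern index so no pattern appears in both sets.
    X_train, y_train = X[:n_train], y[:n_train]
    X_test, y_test = X[n_train:], y[n_train:]
    print(X_train.shape, X_test.shape)
```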




Figure 3: Data sets for the chaotic time series problem: (a) training set for the 
CEP neural network, and (b) test set, which has no overlap with the training 
set. 


Test Results (4-, 6-, and 64-bit Weight Round-Off) 



Figure 4: Simulation results of CEP for the chaotic time series prediction 
problem. The top trace contains four curves: ideal data and the 64-bit, 6-bit, and 4-bit 
prediction results. The bottom trace contains the errors between the ideal data and the 
64-bit, 6-bit, and 4-bit generalization data. 

The results in Figure 4 show that the error between the ideal data and the prediction of the 
network trained with 64-bit weights is within +/-0.01 and resembles white noise, whereas the 6-bit error is 
more harmonic than the 4-bit prediction error. These results suggest that 
the more bits of weight quantization are available for learning, the better and smoother the 
learned transformation is. In addition, a better and smoother transformation helps the 
network interpolate for predictions. 
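
One simple way to quantify whether a prediction error looks like white noise or has harmonic structure is its lag-1 autocorrelation: values near zero suggest noise-like error, values near one suggest a smooth, structured error. The sketch below is an illustration under that assumption and is not the analysis used in the paper. The function name `residual_summary` and the synthetic error signals are hypothetical.

```python
import numpy as np


def residual_summary(ideal, predicted):
    """Return the peak absolute error and the lag-1 autocorrelation of the error."""
    err = np.asarray(ideal, dtype=float) - np.asarray(predicted, dtype=float)
    centered = err - err.mean()
    denom = np.dot(centered, centered)
    lag1 = np.dot(centered[:-1], centered[1:]) / denom if denom > 0 else 0.0
    return {"max_abs_error": float(np.max(np.abs(err))),
            "lag1_autocorr": float(lag1)}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(651)
    ideal = np.sin(2 * np.pi * t / 97.0)
    # Synthetic errors for illustration only: noise-like versus harmonic.
    noisy_pred = ideal + rng.uniform(-0.01, 0.01, size=t.size)
    harmonic_pred = ideal + 0.01 * np.sin(2 * np.pi * t / 50.0)
    print(residual_summary(ideal, noisy_pred))
    print(residual_summary(ideal, harmonic_pred))
```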

IV. Conclusions 

In this paper, we have shown that CEP is a reliable technique for both software- and 
hardware-based neural network learning. The analysis shows that the cascade correlation (CC) 
algorithm is a special case and can be understood in greater depth within this framework. 
Moreover, the theoretical analysis provides us with the general framework of the learning 
architecture, and a particular learning algorithm can be independently studied for its 
suitability for a given application under the constraints specific to each 
problem. For example, CEP is advantageous for hardware implementation, whereas for 
software, the covariance or Newton's second-order method is more advantageous. For the 
CEP learning algorithm, the advantages can be summarized as follows: 

• A fast and reliable learning technique 

• A hardware implementable learning technique 

• A learning scheme tolerant of lower weight resolutions 

• A robust model in learning neural networks 